skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Search for: All records

Creators/Authors contains: "Fu, Yanjie"

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

  1. Free, publicly-accessible full text available August 3, 2026
  2. Feature transformation aims to reconstruct the feature space of raw features to enhance the performance of downstream models. However, the exponential growth in the combinations of features and operations poses a challenge, making it difficult for existing methods to efficiently explore a wide space. Additionally, their optimization is solely driven by the accuracy of downstream models in specific domains, neglecting the acquisition of general feature knowledge. To fill this research gap, we propose an evolutionary LLM framework for automated feature transformation. This framework consists of two parts: 1) constructing a multi-population database through an RL data collector while utilizing evolutionary algorithm strategies for database maintenance, and 2) utilizing the ability of Large Language Model (LLM) in sequence understanding, we employ few-shot prompts to guide LLM in generating superior samples based on feature transformation sequence distinction. Leveraging the multi-population database initially provides a wide search scope to discover excellent populations. Through culling and evolution, high-quality populations are given greater opportunities, thereby furthering the pursuit of optimal individuals. By integrating LLMs with evolutionary algorithms, we achieve efficient exploration within a vast space, while harnessing feature knowledge to propel optimization, thus realizing a more adaptable search paradigm. Finally, we empirically demonstrate the effectiveness and generality of our proposed method. 
    more » « less
    Free, publicly-accessible full text available April 11, 2026
  3. Free, publicly-accessible full text available February 17, 2026
  4. Abstract This study examines the role of human dynamics within Geospatial Artificial Intelligence (GeoAI), highlighting its potential to reshape the geospatial research field. GeoAI, emerging from the confluence of geospatial technologies and artificial intelligence, is revolutionizing our comprehension of human-environmental interactions. This revolution is powered by large-scale models trained on extensive geospatial datasets, employing deep learning to analyze complex geospatial phenomena. Our findings highlight the synergy between human intelligence and AI. Particularly, the humans-as-sensors approach enhances the accuracy of geospatial data analysis by leveraging human-centric AI, while the evolving GeoAI landscape underscores the significance of human–robot interaction and the customization of GeoAI services to meet individual needs. The concept of mixed-experts GeoAI, integrating human expertise with AI, plays a crucial role in conducting sophisticated data analyses, ensuring that human insights remain at the forefront of this field. This paper also tackles ethical issues such as privacy and bias, which are pivotal for the ethical application of GeoAI. By exploring these human-centric considerations, we discuss how the collaborations between humans and AI transform the future of work at the human-technology frontier and redefine the role of AI in geospatial contexts. 
    more » « less
  5. This paper presents GeoDMA , which processes the GPS data from multiple vehicles to detect anomalous driving maneuvers, such as rapid acceleration, sudden braking, and rapid swerving. First, an unsupervised deep auto-encoder is designed to learn a set of unique features from the normal historical GPS data of all drivers. We consider the temporal dependency of the driving data for individual drivers and the spatial correlation among different drivers. Second, to incorporate the peer dependency of drivers in local regions, we develop a geographical partitioning algorithm to partition a city into several sub-regions to do the driving anomaly detection. Specifically, we extend the vehicle-vehicle dependency to road-road dependency and formulate the geographical partitioning problem into an optimization problem. The objective of the optimization problem is to maximize the dependency of roads within each sub-region and minimize the dependency of roads between any two different sub-regions. Finally, we train a specific driving anomaly detection model for each sub-region and perform in-situ updating of these models by incremental training. We implement GeoDMA in Pytorch and evaluate its performance using a large real-world GPS trajectories. The experiment results demonstrate that GeoDMA achieves up to 8.5% higher detection accuracy than the baseline methods. 
    more » « less
  6. Aidong Zhang; Huzefa Rangwala (Ed.)
    In many scenarios, 1) data streams are generated in real time; 2) labeled data are expensive and only limited labels are available in the beginning; 3) real-world data is not always i.i.d. and data drift over time gradually; 4) the storage of historical streams is limited. This learning setting limits the applicability and availability of many Machine Learning (ML) algorithms. We generalize the learning task under such setting as a semi-supervised drifted stream learning with short lookback problem (SDSL). SDSL imposes two under-addressed challenges on existing methods in semi-supervised learning and continuous learning: 1) robust pseudo-labeling under gradual shifts and 2) anti-forgetting adaptation with short lookback. To tackle these challenges, we propose a principled and generic generation-replay framework to solve SDSL. To achieve robust pseudo-labeling, we develop a novel pseudo-label classification model to leverage supervised knowledge of previously labeled data, unsupervised knowledge of new data, and, structure knowledge of invariant label semantics. To achieve adaptive anti-forgetting model replay, we propose to view the anti-forgetting adaptation task as a flat region search problem. We propose a novel minimax game-based replay objective function to solve the flat region search problem and develop an effective optimization solver. Experimental results demonstrate the effectiveness of the proposed method. 
    more » « less